TransMiner: Mining Transitive Associations among Biological Objects from Medline
Abstract
Summary: Associations among biological objects such as genes, proteins, and drugs can be discovered automatically from the scientific literature. This paper describes TransMiner, a system for finding interesting associations among objects by mining the Medline database of scientific literature. Direct associations among the objects are discovered from the text documents based on the principle of co-occurrence. The principle of transitive closure is then applied to the association graph to find potential transitive associations. Potential transitive associations that are in fact direct are identified by iterative retrieval and mining of the Medline database. Associations that are not found explicitly anywhere in the Medline database are transitive associations and are candidates for hypothesis generation. All discovered direct associations were manually evaluated. The direct and transitive associations are visualized with a graph visualization applet for use by scientists. TransMiner was tested by finding associations among 56 breast cancer genes and among 24 objects in the calpain signal transduction pathway. Of the 413 direct associations discovered among the 56 gene symbols involved in breast cancer, 329 (79.66%) were found to have some valid biological association. Of the 159 direct associations discovered among the 24 objects involved in the calpain signal transduction pathway, 155 (97.48%) were found to have some valid biological association.
Availability: Graph visualization applets and result tables are available at http://sifter.cs.iupui.edu/~sifter/transMiner
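The pipeline outlined in the summary (co-occurrence edges, transitive closure, then a re-check of candidate pairs against the full database) can be illustrated with a minimal Python sketch. This is not the TransMiner implementation: the function names, the whitespace tokenization, the placeholder object names `geneA`/`geneB`/`geneC`, and the `cooccur_in_corpus` callback standing in for iterative Medline retrieval are all assumptions made for illustration.

```python
from itertools import combinations


def direct_associations(documents, objects):
    """Return unordered object pairs that co-occur in at least one document.

    Assumption: an object is "mentioned" if its lowercased name appears as a
    whitespace-separated token; a real system would need proper term matching.
    """
    edges = set()
    for text in documents:
        tokens = set(text.lower().split())
        present = sorted(o for o in objects if o.lower() in tokens)
        edges.update(combinations(present, 2))  # pairs come out sorted
    return edges


def transitive_candidates(edges):
    """Return pairs joined by a path in the association graph but by no edge."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    candidates = set()
    for start in neighbors:
        # Depth-first search collects every node reachable from `start`.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # Reachable but not adjacent: a potential transitive association.
        for other in seen - {start} - neighbors[start]:
            candidates.add(tuple(sorted((start, other))))
    return candidates


def split_candidates(candidates, cooccur_in_corpus):
    """Separate candidates into (actually direct, transitive).

    `cooccur_in_corpus(a, b)` is a hypothetical callback that reports whether
    the pair co-occurs anywhere in the full literature database.
    """
    direct = {pair for pair in candidates if cooccur_in_corpus(*pair)}
    return direct, candidates - direct


# Toy corpus with placeholder object names (not real findings).
docs = ["geneA binds geneB", "geneB regulates geneC"]
names = ["geneA", "geneB", "geneC"]

edges = direct_associations(docs, names)
candidates = transitive_candidates(edges)
direct, transitive = split_candidates(candidates, lambda a, b: False)
```

In this toy corpus `geneA` and `geneC` never share a document but are linked through `geneB`, so the pair surfaces as a candidate. Because the stand-in callback reports no co-occurrence in the wider corpus, it stays in the transitive set, which is the set the summary describes as candidates for hypothesis generation.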
Similar resources
Identification of Biological Relationships from Text Documents
s were used. Figure 16-1. Graph showing relationships between genes in Known Pathway. The higher the Association strength the closer the genes appear on the graph. In this way the related genes are clustered together and can be picked out. A graphical presentation of the unknown pathway (Table 16-2) is shown in Figure 16-1. The relationship discovery aspect of this method was excellent. This wa...
A New Text Mining Approach for Finding Protein-to-Disease Associations
Discovering significant relationships between biological entities from text documents is an important task for biologists seeking to develop biological models for research and discovery, especially given the enormous volume of existing biomedical documents and the rate at which it grows every day. We propose a new text mining method to extract associations between biological entities...
Mining significant substructure-substructure pairs in structural associations
Biological and chemical objects in bioinformatics are often linked to structures: proteins and DNAs to their sequences, small-molecule compounds to their chemical structures, and pathways to their network structures. Associations among these objects, such as interactions between proteins and compounds, are of particular interest in the field, and these associations are often given as a graph: t...
An Efficient Computation of Reachability Labeling for Social Networking Using Graph Pattern Mining: an Application of Data Mining
Graphs form a powerful modeling tool to represent complex relationships among objects in an effective manner. Graph pattern matching is one of the areas of data mining where the data is stored in the form of graphs and the set of tuples that match a user-given graph pattern are extracted. For finding the set of matching tuples faster, all the possible paths in the large directed graph, i.e., tr...
An online literature mining tool for protein phosphorylation
A web-based version of the RLIMS-P literature mining system was developed for online mining of protein phosphorylation information from MEDLINE abstracts. The online tool presents extracted phosphorylation objects (phosphorylated proteins, phosphorylation sites and protein kinases) in summary tables and full reports with evidence-tagged abstracts. The tool further allows mapping of phosphorylat...